What is claimed is:

1. A method of modifying HMM models trained on clean speech with cepstral mean normalization to provide models that compensate for simultaneous channel/microphone distortion and background noise (additive distortion), comprising the steps of:

for each speech utterance, calculating the mean mel-scaled cepstrum coefficients (MFCC) vector $b$ over the clean database;

adding the mean MFCC vector $b$ to the mean vectors $m_{p,j,k}$ of the original HMM models, where $p$ is the index of the PDF, $j$ is the state, and $k$ is the mixing component, to get $\bar{m}_{p,j,k}$;

for a given speech utterance, calculating an estimate of the background noise vector $X$;

calculating the model mean vectors adapted to the noise $X$ using $\hat{m}_{p,j,k} = \mathrm{IDFT}(\mathrm{DFT}(\bar{m}_{p,j,k}) \oplus \mathrm{DFT}(X))$ to get the noise-compensated mean vector, where the Inverse Discrete Fourier Transform is taken of the sum of the Discrete Fourier Transform of the mean vectors $\bar{m}_{p,j,k}$, modified by the mean MFCC vector $b$, added to the Discrete Fourier Transform of the estimated noise $X$; and

calculating the mean vector $\hat{b}$ of the noisy data over the noisy speech space, and removing the mean vector $\hat{b}$ of the noisy data from the model mean vectors adapted to noise to get the target model.
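
The steps of Claim 1 can be sketched in code as follows. This is only an illustrative sketch under stated assumptions, not the claimed implementation: an orthonormal DCT pair stands in for the DFT/IDFT pair named in the claim, the combining operator is assumed to be an element-wise log-add in the log-spectral domain, and the noisy-data mean is taken with equal weight on every Gaussian. All function names (`cep_to_logspec`, `logspec_to_cep`, `log_add`, `adapt_means`) are hypothetical.

```python
import math

def cep_to_logspec(c):
    """Map a cepstral vector to the log-spectral domain.

    Assumption: an orthonormal inverse DCT stands in for the claim's "DFT".
    """
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        for k in range(1, n):
            s += c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(s)
    return out

def logspec_to_cep(v):
    """Inverse of cep_to_logspec (orthonormal DCT-II), standing in for "IDFT"."""
    n = len(v)
    out = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        out.append(scale * sum(v[i] * math.cos(math.pi * (i + 0.5) * k / n)
                               for i in range(n)))
    return out

def log_add(u, v):
    """Element-wise log(exp(u) + exp(v)), computed stably.

    Assumption: this is the meaning of the combining operator in the claim.
    """
    return [max(a, b) + math.log1p(math.exp(-abs(a - b))) for a, b in zip(u, v)]

def adapt_means(clean_means, b, noise):
    """Return (target means, noisy-data mean) for CMN-trained clean means."""
    noise_spec = cep_to_logspec(noise)
    adapted = []
    for m in clean_means:
        shifted = [mi + bi for mi, bi in zip(m, b)]       # add clean mean b back
        combined = log_add(cep_to_logspec(shifted), noise_spec)
        adapted.append(logspec_to_cep(combined))          # noise-adapted mean
    dim = len(b)
    # Assumption: every Gaussian is weighted equally when averaging.
    b_hat = [sum(m[d] for m in adapted) / len(adapted) for d in range(dim)]
    # Remove the noisy-data mean to obtain the target model means.
    target = [[m[d] - b_hat[d] for d in range(dim)] for m in adapted]
    return target, b_hat

# Toy example: two Gaussians whose CMN-trained means average to zero.
clean_means = [[1.0, -0.5, 0.2], [-1.0, 0.5, -0.2]]
target, b_hat = adapt_means(clean_means, b=[10.0, 1.0, 0.0], noise=[8.0, 0.0, 0.0])
```

With these equal weights the returned target means again average to zero, which mirrors the effect of cepstral mean normalization on the noisy data.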

2. The method of Claim 1 wherein the step of calculating the mean vector $\hat{b}$ of the noisy data over the noisy speech space calculates the vector using statistics of the noisy model, using:

$$\hat{b} = \sum_{p} \sum_{j} \sum_{k} P_{H}(p)\, P_{J|H}(j|p)\, P_{K|H,J}(k|p,j)\, \hat{m}_{p,j,k}$$

where $H$ is the variable denoting the PDF index, $J$ is the variable for the state index, and $K$ is the variable for the mixing component index.
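
The weighted sum recited in Claim 2 can be illustrated with a short sketch. The function name `noisy_mean` and the dictionary layout are hypothetical choices made for this example only; the probabilities would come from the noisy model's statistics.

```python
def noisy_mean(adapted_means, p_pdf, p_state, p_mix):
    """Probability-weighted average of noise-adapted mean vectors.

    adapted_means: {(p, j, k): vector}   noise-adapted means
    p_pdf:         {p: P(p)}             PDF probability
    p_state:       {(p, j): P(j | p)}    state probability given the PDF
    p_mix:         {(p, j, k): P(k | p, j)}  mixing-component probability
    """
    dim = len(next(iter(adapted_means.values())))
    b_hat = [0.0] * dim
    for (p, j, k), mean in adapted_means.items():
        weight = p_pdf[p] * p_state[(p, j)] * p_mix[(p, j, k)]
        for d in range(dim):
            b_hat[d] += weight * mean[d]
    return b_hat

# One PDF "a" with two equally likely states, one mixing component each.
means = {("a", 0, 0): [0.0, 0.0], ("a", 1, 0): [2.0, 4.0]}
b_hat = noisy_mean(
    means,
    p_pdf={"a": 1.0},
    p_state={("a", 0): 0.5, ("a", 1): 0.5},
    p_mix={("a", 0, 0): 1.0, ("a", 1, 0): 1.0},
)
print(b_hat)  # [1.0, 2.0]
```

Setting all three probability tables to constants reduces the sum to a plain average over the Gaussians, which corresponds to the equal-probability cases in the dependent claims.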

3. The method of Claim 2 wherein said calculating the mean vector $\hat{b}$ uses equal probabilities for $P_{H}(p)$:

$P_{H}(p) = C$.

4. The method of Claim 2 wherein equal probabilities are used for $P_{H}(p)$, $P_{J|H}(j|p)$ and $P_{K|H,J}(k|p,j)$:

$P_{H}(p) = C$

$P_{J|H}(j|p) = D$

$P_{K|H,J}(k|p,j) = E$.

5. The method of Claim 3 wherein the mean vector $\hat{b}$ becomes equal to:

$\hat{b} = \mathrm{IDFT}(\mathrm{DFT}(b) \oplus \mathrm{DFT}(X))$.

6. A method of speech recognition with compensation for channel distortion and background noise, comprising the steps of:

providing HMM models trained on clean speech with cepstral mean normalization;

for each utterance:

calculating the mean mel-scaled cepstrum coefficients (MFCC) vector $b$ over the clean database;

adding the mean MFCC vector $b$ to the mean vectors $m_{p,j,k}$ of the original HMM models, where $p$ is the index of the PDF, $j$ is the state, and $k$ is the mixing component, to get $\bar{m}_{p,j,k}$;

for a given speech utterance, calculating an estimate of the background noise vector $X$;

calculating the model mean vectors adapted to the noise $X$ using $\hat{m}_{p,j,k} = \mathrm{IDFT}(\mathrm{DFT}(\bar{m}_{p,j,k}) \oplus \mathrm{DFT}(X))$ to get the noise-compensated mean vector, where the Inverse Discrete Fourier Transform is taken of the sum of the Discrete Fourier Transform of the mean vectors $\bar{m}_{p,j,k}$, modified by the mean MFCC vector $b$, added to the Discrete Fourier Transform of the estimated noise $X$;

calculating the mean vector $\hat{b}$ of the noisy data over the noisy speech space, and removing the mean vector $\hat{b}$ of the noisy data from the model mean vectors adapted to noise to get the target model; and

comparing the target model to the speech input utterance to recognize speech.

7. The method of Claim 6 wherein the step of calculating the mean vector $\hat{b}$ of the noisy data over the noisy speech space calculates the vector using statistics of the noisy model, using:

$$\hat{b} = \sum_{p} \sum_{j} \sum_{k} P_{H}(p)\, P_{J|H}(j|p)\, P_{K|H,J}(k|p,j)\, \hat{m}_{p,j,k}$$

where $H$ is the variable denoting the PDF index, $J$ is the variable for the state index, and $K$ is the variable for the mixing component index.

8. The method of Claim 7 wherein said calculating the mean vector $\hat{b}$ uses equal probabilities for $P_{H}(p)$:

$P_{H}(p) = C$.

9. The method of Claim 7 wherein equal probabilities are used for $P_{H}(p)$, $P_{J|H}(j|p)$ and $P_{K|H,J}(k|p,j)$:

$P_{H}(p) = C$

$P_{J|H}(j|p) = D$

$P_{K|H,J}(k|p,j) = E$.

10. The method of Claim 9 wherein the mean vector $\hat{b}$ becomes equal to:

$\hat{b} = \mathrm{IDFT}(\mathrm{DFT}(b) \oplus \mathrm{DFT}(X))$.



11. A speech recognizer with compensation for channel distortion and background noise, comprising in combination:

adapted HMM models generated by modifying HMM models trained on clean speech with cepstral mean normalization, wherein said models are adapted by:

for each utterance:

calculating the mean mel-scaled cepstrum coefficients (MFCC) vector $b$ over the clean database;

adding the mean MFCC vector $b$ to the mean vectors $m_{p,j,k}$ of the original HMM models, where $p$ is the index of the PDF, $j$ is the state, and $k$ is the mixing component, to get $\bar{m}_{p,j,k}$;

for a given speech utterance, calculating an estimate of the background noise vector $X$;

calculating the model mean vectors adapted to the noise $X$ using $\hat{m}_{p,j,k} = \mathrm{IDFT}(\mathrm{DFT}(\bar{m}_{p,j,k}) \oplus \mathrm{DFT}(X))$ to get the noise-compensated mean vector, where the Inverse Discrete Fourier Transform is taken of the sum of the Discrete Fourier Transform of the mean vectors $\bar{m}_{p,j,k}$, modified by the mean MFCC vector $b$, added to the Discrete Fourier Transform of the estimated noise $X$;

calculating the mean vector $\hat{b}$ of the noisy data over the noisy speech space, and removing the mean vector $\hat{b}$ of the noisy data from the model mean vectors adapted to noise to get the adapted model; and

means for comparing the adapted model to the speech input utterance to recognize the input speech.

12. The recognizer of Claim 11 wherein the step of calculating the mean vector $\hat{b}$ of the noisy data over the noisy speech space calculates the vector using statistics of the noisy model, using:

$$\hat{b} = \sum_{p} \sum_{j} \sum_{k} P_{H}(p)\, P_{J|H}(j|p)\, P_{K|H,J}(k|p,j)\, \hat{m}_{p,j,k}$$

where $H$ is the variable denoting the PDF index, $J$ is the variable for the state index, and $K$ is the variable for the mixing component index.

13. The recognizer of Claim 12 wherein said calculating the mean vector $\hat{b}$ uses equal probabilities for $P_{H}(p)$:

$P_{H}(p) = C$.

14. The recognizer of Claim 12 wherein equal probabilities are used for $P_{H}(p)$, $P_{J|H}(j|p)$ and $P_{K|H,J}(k|p,j)$:

$P_{H}(p) = C$

$P_{J|H}(j|p) = D$

$P_{K|H,J}(k|p,j) = E$.

15. The method of Claim 12 wherein the mean vector $\hat{b}$ becomes equal to:

$\hat{b} = \mathrm{IDFT}(\mathrm{DFT}(b) \oplus \mathrm{DFT}(X))$.

16. A method of speech recognition with simultaneous compensation for both channel/microphone distortion and background noise, comprising the steps of:

modifying HMM models trained on clean speech with cepstral mean normalization;

for each speech utterance, calculating the mean MFCC vector for a clean database;

adding this mean MFCC vector to the original HMM models;

estimating the background noise for a given speech utterance;

determining the model mean vectors adapted to the noise;

determining the mean vector of the noisy data over the noisy speech space; and

removing the mean vector of the noisy data over the noisy speech space from the model mean vectors adapted to the noise to get the target model.

17. A method of speech recognition comprising the steps of:

providing HMM models trained on clean speech with cepstral mean normalization; and

modifying the HMM models to compensate simultaneously for convolutive distortion and background noise.